The framework of generalized probabilistic theories (GPTs) is a popular approach for studying the physical 
foundations of quantum theory. The standard framework assumes the no-restriction hypothesis, in which the 
state space of a physical theory determines the set of measurements. However, this assumption is not physically 
motivated. We generalize the framework to account for systems that do not obey the no-restriction hypothesis. 
We then show how our framework can be used to describe new classes of probabilistic theories, for example 
those which include intrinsic noise. Relaxing the no-restriction hypothesis also allows us to introduce a
'self-dualization' procedure, which yields a new class of theories that share many features of quantum theory, such as
obeying Tsirelson's bound for the maximally entangled state. We then characterize joint states, generalizing the 
maximal tensor product. We show how this new tensor product can be used to describe the convex closure of 
the Spekkens toy theory, and in doing so we obtain an analysis of why it is local in terms of the geometry of its 
state space. We show that the unrestricted version of the Spekkens toy theory is the theory known as 'boxworld' 
that allows maximal nonlocal correlations. 



I. INTRODUCTION 

The framework of generalized probabilistic theories (GPTs) 
is a modern operational approach for studying the physical 
foundations of quantum theory. The framework is operational
because a theory is defined according to the observable
measurement statistics that it predicts. In contrast, quantum
theory is usually defined using an abstract mathematical
formalism without physical motivation (e.g. the density ma- 
trix formalism). Assuming only basic principles, the frame- 
work encompasses a large variety of theories. For example, 
quantum theory and classical probability theory are special 
cases of GPTs. The focus of work on GPTs is to identify 
the unique physical properties that distinguish quantum the- 
ory from other theories. More generally, one can examine 
the relationship between different physical properties, such as 
no-cloning and nonlocality, without restricting to a particular 
physical theory. 

Using this framework, it has been shown that many properties
that were thought to be particular to quantum theory are
in fact very general. As a sample of such results, it was shown
that any non-classical probability theory (in the sense to be
described in section II) has the following properties: the
existence of entanglement [1]; for mixed states, the lack of a
unique decomposition into an ensemble of pure states;
generalizations of the no-cloning or no-broadcasting theorem [2];
and an information-disturbance trade-off [3]. Notably, recent
attempts to reconstruct quantum theory from physical axioms
include the assumptions made in GPTs [4] or very similar
assumptions [5, 6].

A GPT is defined by a set of preparations, a set of measure- 
ments, and composition rules for multipartite systems called 
the tensor product of the theory. In general there is a trade-off 
between possible preparations and possible measurement out- 
comes: the larger the set of preparations, the smaller the upper 
bound on the set of allowed measurements [7]. In the existing
GPT framework, it is usually assumed that this upper bound 
is saturated. This means that, for a chosen set of states, all 
potential measurement outcomes that yield probability-valued
results are assumed to be physically realizable. This is called
the no-restriction hypothesis [6]. This assumption is not based
on any physical motivation, and it is usually assumed for the 
sake of mathematical convenience. In this work we take on 
the task of extending the framework of GPTs when the no- 
restriction hypothesis is abandoned. This extension of GPTs 
therefore brings the framework closer to the operational moti- 
vation for which it was originally initiated. 

Our contribution. The idea of removing the no-restriction 
hypothesis (or replacing it with other assumptions) has
appeared sporadically in other works [6, 8]. However, until now
a systematic analysis of the consequences of doing so has been 
lacking. In this paper we provide a well-defined framework 
with the no-restriction hypothesis omitted, whilst keeping the 
other assumptions of the GPT framework. Our work then pro- 
ceeds in two parts. 

In the first part we show that this new framework encom- 
passes more theories than before. For example, we show that 
theories with intrinsic noise can be described in our frame- 
work, but not in the existing GPT framework. We also pro- 
vide a procedure for constructing a self-dual theory from a 
theory which is not self-dual. The importance of this is that 
self-duality has been shown to imply 'quantum-like' behaviour
(for example, limiting bipartite nonlocality to Tsirelson's bound
for the maximally entangled state [9]). Hence this allows us to
introduce a new class of probabilistic theories with 'quantum- 
like' behaviour, and crucially, this is a class of theories which 
does not satisfy the no-restriction hypothesis. 

In the second part, we develop the treatment of composite 
systems. In particular, we show that our extension requires 
a new (and more general) definition of the tensor product for 
describing composite systems. This significantly extends the 
GPT framework, since it allows us to analyse the relation- 
ship between nonlocality and the geometry of the state space 
of a theory, building on previous work in this direction. For 
example, we show how the Spekkens toy theory (for which 
the connection to GPTs had not been previously established) 
can be viewed as a GPT, but only in our more general frame- 
work. Moreover, this allows us to give an analysis of why the 
Spekkens theory is local, using the geometry of its state space. 

Structure of the paper. In section II we give a brief
overview of the framework of GPTs. We then begin the first
part of our analysis, concentrating on single systems. In
section III we describe in detail the no-restriction hypothesis, and
some consequences of relaxing this assumption. In section
IV we develop the important example of theories with noise.
In section V we introduce the self-dualization procedure, and
discuss the class of theories that this introduces. We then
enter the second part of our analysis, which concerns composite
systems. In section VI we explain how joint states of
composite systems are usually described. In section VII we show
why a new definition of composite systems is needed, and we
introduce this definition. We then study examples of theories
such as the Spekkens model.



II. GENERALIZED PROBABILISTIC THEORIES: A 
BRIEF SUMMARY 

A physical experiment consists of the following steps: the 
preparation of a system, transformations of that system (e.g. 
by inherent dynamics), and a measurement. In general, the 
measurement will have different outcomes, each occurring with
some probability. Defining a generalized probabilistic theory 
amounts to specifying these probabilities for any such combi- 
nation of preparation, transformation and measurement. Note 
that transformations can be absorbed into either the prepara- 
tion or the measurement. Hence to define the allowed prob- 
ability distributions of a GPT, it suffices to define the set of 
preparation procedures and the set of measurements. 



1. States and effects 

Consider a class of preparation procedures which all yield 
exactly the same measurement statistics. The members of this 
class are experimentally indistinguishable. Since a GPT con- 
cerns only experimental statistics, we can define a state of a 
system as such an equivalence class. Analogously we also 
define an effect as an equivalence class of measurement out- 
comes. We will refer to this identification of states and effects 
with their respective measurement statistics as the equivalence 
principle. Mathematically, states are represented by elements 
of a vector space V. Effects are linear functionals on states, 
i.e. elements of the dual space $V^*$. Applying an effect $e$ to
a state $\omega$ yields the probability $p(e|\omega) = e(\omega)$ for the
corresponding measurement outcome to occur when measuring the
system in the state. Without loss of generality we will choose
a specific representation of states and effects in this paper to
demonstrate the abstract concepts. Both states and effects will
be represented by vectors embedded in $\mathbb{R}^n$. The application of
effects on states is given by the Euclidean inner product of the
respective vectors:

$$e = (e_1, \dots, e_n)^T, \qquad \omega = (\omega_1, \dots, \omega_n)^T \tag{1}$$
$$p(e|\omega) = e^T \cdot \omega = \sum_i e_i \omega_i \tag{2}$$



The GPT framework also accounts for ensembles of prepa- 
rations or measurements, in which there is uncertainty about 
which measurement is implemented, or which state has been 
prepared. This could occur if there is a probabilistic selection 
of the preparation procedure, for example. This probability 
distribution is represented by using mixed states and mixed 
effects, given by convex combinations: 

$$e = \sum_i \lambda_i e_i, \qquad \lambda_i \geq 0, \quad \sum_i \lambda_i = 1 \tag{3}$$
$$\omega = \sum_j \mu_j \omega_j, \qquad \mu_j \geq 0, \quad \sum_j \mu_j = 1 \tag{4}$$

corresponding to ensembles $\{\lambda_i, e_i\}$ and $\{\mu_j, \omega_j\}$.
Consequently, states and effects form convex sets. If the only convex
decomposition of a state $\omega$ is such that $\omega \propto \omega_j$ for all $j$, then
the state is a pure state. Similarly, if the only convex
decomposition of an effect $e$ using Eq. (3) is such that $e \propto e_i$ for all $i$,
then the effect is a pure effect.

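Since effects and states act linearly on each other, the
probability distribution for the ensembles is the weighted sum of the
probabilities $p_{ij} = e_i(\omega_j)$ of individual ensemble elements:

$$e(\omega) = \sum_{i,j} \lambda_i \mu_j \, e_i(\omega_j) = \sum_{i,j} \lambda_i \mu_j \, p_{ij} \tag{5}$$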
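
The linearity behind this weighted sum can be checked numerically. The following is a minimal sketch, assuming the vector representation with the Euclidean inner product; the ensemble elements and weights are hypothetical values chosen only for illustration.

```python
from fractions import Fraction as F

def apply(effect, state):
    """Probability e(w): Euclidean inner product of the two vectors."""
    return sum(e_k * w_k for e_k, w_k in zip(effect, state))

def mix(weights, vectors):
    """Convex combination sum_i weights[i] * vectors[i]."""
    assert all(w >= 0 for w in weights) and sum(weights) == 1
    return tuple(sum(w * v[k] for w, v in zip(weights, vectors))
                 for k in range(len(vectors[0])))

# Hypothetical ensemble elements (illustrative values, not fixed by the text).
effects = [(F(1, 2), F(1, 2), F(1, 2)), (F(-1, 2), F(1, 2), F(1, 2))]
states = [(F(1), F(0), F(1)), (F(0), F(1), F(1))]
lam = [F(1, 3), F(2, 3)]  # weights of the effect ensemble
mu = [F(1, 4), F(3, 4)]   # weights of the state ensemble

# Apply the mixed effect to the mixed state directly ...
direct = apply(mix(lam, effects), mix(mu, states))
# ... and compare with the weighted sum of the individual probabilities.
weighted = sum(l * m * apply(e, w)
               for l, e in zip(lam, effects)
               for m, w in zip(mu, states))
```

Exact rational arithmetic is used so that the two values can be compared with `==` rather than up to rounding.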



More generally, consider the result of applying different mea- 
surements to systems prepared by the same method. In gen- 
eral, there will be measurement outcomes with probabilities 
that are linearly dependent for a fixed state. Analogously, one 
might find linear dependencies between the probabilities for 
a fixed measurement outcome under variations of the state 
that is prepared. This implies a linear dependence between 
the vector space elements $\omega \in V$ representing the states; there
is a corresponding linear dependence for the effects $e \in V^*$.
This determines the dimension of V as the minimal number
of different measurement outcomes needed to identify a state
uniquely (this is called the 'fiducial set' of measurement
outcomes by Hardy [10]). In this paper we restrict ourselves to
systems for which the vector space V has finite dimension. 
Hence the dimension of V is equal to the dimension of the 
dual space V*, which is the minimal number of preparations 
required to identify an effect. 



2. Normalization and measurements 

A central concept in the GPT framework is the description 
of perfect preparations and measurements. A perfect prepara- 
tion is one that is guaranteed to succeed. It is represented by a 
normalized state, where normalization is defined with respect to
a special effect, called the unit measure $u$. The set of all
normalized states is called the state space $\Omega$. The unit measure $u$
represents an unbiased measurement with only one outcome:
this outcome occurs if a preparation has succeeded, i.e. it is
determined by

$$u(\omega) = 1 \quad \forall \omega \in \Omega. \tag{6}$$

In the specific representation used in this paper we choose
$u := (0, \dots, 0, 1)^T$.

Consequently, for a state $\omega$ embedded in an $n$-dimensional
vector space V, the normalization of $\omega$ is directly apparent
from the last component $\omega_n$, i.e. normalized states have
$\omega_n = 1$.

An effect is a map $e : \Omega \to [0,1]$ that gives a probability
when applied to a normalized state $\omega$. A perfect measurement
consists of a set of effects $\{e_i\}$ which sum up to the unit
measure, i.e.:

$$\sum_i e_i = u.$$

Thus, measurement probabilities sum up to one for any 
perfectly-prepared system. 

Beyond the description of perfect preparations and mea- 
surements, the GPT framework also accounts for the opposite 
extreme, namely preparations that always fail or measurement 
outcomes that never occur no matter which state they are ap- 
plied to. The corresponding states and effects are given by the 
zero elements of V and V* with 

$$0(\omega) = 0 \quad \forall \omega \in V \tag{7}$$
$$e(0) = 0 \quad \forall e \in V^*. \tag{8}$$

Imperfections in preparations yield unnormalized states
resulting from the mixture of a normalized state $\omega$ and 0.
Detector deficiencies and bias can be addressed by mixing
every effect of a perfect measurement with 0 or another
common effect. However, we will show in section VI 3 that
consistency conditions on joint states forbid imperfect
measurements. Consequently, the measurement has to be completed
by an additional effect, such that the effects sum up to the unit 
measure, even though the occurrence of this additional mea- 
surement outcome cannot be registered by an experimenter 
due to detector deficiencies. 



3. Equivalent Representations 

Consider applying an arbitrary bijective linear map $L^T$ on all
effects and the corresponding inverse map $L^{-1}$ on all states.
This leaves the results from any combination of effects and
states invariant, since:

$$(L^T \cdot e)[L^{-1} \cdot \omega] = (L^T \cdot e)^T \cdot L^{-1} \cdot \omega = e^T \cdot L \cdot L^{-1} \cdot \omega = e^T \cdot \omega. \tag{9}$$

Now, a particular probabilistic theory is associated with a
particular state space $\Omega$ and set of effects E. But theories are
distinguished only by the different measurement statistics that
are possible (as is guaranteed by using the equivalence
principle). Hence if $\Omega$ and E are transformed according to (9), then
the resulting $\Omega'$ and $E'$ define the same theory, since this
transformed state space and effect set yield the same measurement
statistics.
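
The invariance of the measurement statistics under such a change of representation can be verified on a small example. The sketch below assumes a hypothetical invertible map `L` (a shear combined with a rescaling) whose last row is (0, 0, 1), so that the unit measure keeps its form; effects are mapped with the transpose of `L` and states with its inverse.

```python
from fractions import Fraction as F

def matvec(m, v):
    """Matrix-vector product for a matrix given as a list of rows."""
    return tuple(sum(m[r][c] * v[c] for c in range(len(v)))
                 for r in range(len(m)))

def transpose(m):
    return [list(col) for col in zip(*m)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# Hypothetical invertible map L and its inverse (illustrative choice).
L = [[F(2), F(1), F(0)],
     [F(0), F(1), F(0)],
     [F(0), F(0), F(1)]]
L_inv = [[F(1, 2), F(-1, 2), F(0)],
         [F(0), F(1), F(0)],
         [F(0), F(0), F(1)]]

effects = [(F(1, 2), F(1, 2), F(1, 2)), (F(-1, 2), F(1, 2), F(1, 2))]
states = [(1, 0, 1), (0, 1, 1), (-1, 0, 1), (0, -1, 1)]

# Probabilities before and after the change of representation.
pairs = [(dot(e, w), dot(matvec(transpose(L), e), matvec(L_inv, w)))
         for e in effects for w in states]
```

Every pair in `pairs` contains the same probability twice, which is the content of the invariance argument.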



4. Examples 

Quantum theory. Consider the usual quantum formalism,
for which a state is given by a density matrix $\rho$ on a Hilbert
space $\mathcal{H}$. By decomposing density matrices in an operator
basis, we obtain the real vector space V defined above for
quantum theory. For example, there is a well-known
representation of the normalized states of a qubit as a linear
combination of the Pauli operators $\sigma_i$:

$$\rho = \tfrac{1}{2}(\mathbb{1} + a\sigma_x + b\sigma_y + c\sigma_z), \qquad a^2 + b^2 + c^2 \leq 1 \tag{10}$$

Forming a real vector from the coefficients $a, b, c$ gives the
representation of the qubit state space in $V = \mathbb{R}^3$: this is the
Bloch ball.

Adding a fourth component that indicates normalization 
gives a representation similar to (1). However, for quantum
systems of higher dimension the characterization of the geo- 
metrical shape of the state spaces in this representation is still 
an open problem [11]. 

In the usual density matrix representation an effect is a
POVM element $E$, which is applied via the trace rule, so that
the probability of an effect $E$ given the state $\rho$ is given by
$\mathrm{Tr}[E\rho]$. The unit measure $u$ is given by the identity operator
$\mathbb{1}$ on $\mathcal{H}$, so that a density matrix $\rho$ is normalized when:

$$\mathrm{Tr}[\mathbb{1}\rho] = 1.$$

Note that for quantum systems the set of states and the set of
effects can be identified: this is the set of positive operators on
$\mathcal{H}$. For example, for a qubit the Bloch ball represents both
(normalised) states and effects. This is an example of
'self-duality' in a theory; we shall discuss this further in section V.

Classical probability theory. The state space of a classical
system is a simplex. This is the convex hull of $d + 1$
pure states (which can be characterized via a condition on
linear independence). For example, for $d = 1$, the classical state
space is geometrically a line, which represents a bit. The
extreme points of the line, $\omega_0$ and $\omega_1$, are the pure states: these
represent the values 0 or 1 of the bit respectively. The
convex mixtures $p\,\omega_0 + (1 - p)\,\omega_1$ represent states of classical
uncertainty about the value of the bit. Only one measurement
outcome is needed to identify the state, e.g. the probability of
obtaining the value 0 for the bit. For $d = 2$, the simplex is a
triangle in $\mathbb{R}^2$, which represents a trit; and so on. As for a bit,
for any $d$ the pure states $\omega_i$ represent mutually exclusive
properties of the system. For example, if one knows with certainty
which number is on top of a die, then one automatically knows
that none of the other numbers is on top. This means that the
pure effects then correspond to measurement outcomes that
perfectly distinguish the $\omega_i$, i.e. $e_i(\omega_j) = \delta_{ij}$.

FIG. 1. The construction of the effect set E in the traditional GPT framework with the no-restriction hypothesis is shown in the middle. Without
the no-restriction hypothesis the definition of the effect set becomes an independent part of the theory specification (right picture).

Boxworld. This is a popular toy theory in the GPT
framework that is neither quantum nor classical, which was first
introduced systematically in [1]. Boxworld consists of a class
of single systems characterized by the dimension $d \geq 2$ of the
state space. For $d = 2$ the normalized state space $\Omega$ is the
convex hull of the following pure states:

$$\omega_1 = (1, 0, 1)^T, \qquad \omega_2 = (0, 1, 1)^T, \tag{11}$$
$$\omega_3 = (-1, 0, 1)^T, \qquad \omega_4 = (0, -1, 1)^T, \tag{12}$$

and so geometrically $\Omega$ is a square. The set of effects is given
by the convex hull of $0 = (0, 0, 0)^T$, $u = (0, 0, 1)^T$ and the
following extremal effects:

$$e_1 = \tfrac{1}{2}(1, 1, 1)^T, \qquad e_2 = \tfrac{1}{2}(-1, 1, 1)^T, \tag{13}$$
$$e_3 = \tfrac{1}{2}(-1, -1, 1)^T, \qquad e_4 = \tfrac{1}{2}(1, -1, 1)^T. \tag{14}$$
It is straightforward to show that the measurement statistics
of the two orthogonal binary measurements $M_1 = \{e_1, e_3\}$
and $M_2 = \{e_2, e_4\}$ give enough information to identify any
state. Indeed, due to the normalization constraint the
measurement statistics of the binary measurements on normalized
states are determined by the probabilities $p_1, p_2$ for the first
outcomes $e_1, e_2$. The different states give rise to the full range
$(p_1, p_2) \in [0, 1]^2$ of possible probability distributions, with the
probabilities $p_1$ and $p_2$ being independent. Hence the
measurement outcomes $e_1$ and $e_2$ are enough to identify the state
of the system, which verifies that the dimension is $d = 2$. Note
that unlike orthogonal measurements in quantum theory (such
as $\sigma_x$ and $\sigma_z$), there is no uncertainty principle for $M_1$ and $M_2$
for this system [12]. For example, although $e_1$ and $e_4$ belong
to orthogonal measurements, we have $e_1(\omega_1) = e_4(\omega_1) = 1$.
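
These statements about the square state space can be checked directly. The sketch below is an illustration only: it assumes the pure states (1,0,1), (0,1,1), (-1,0,1), (0,-1,1) and the extremal effects (1,1,1)/2, (-1,1,1)/2, (-1,-1,1)/2, (1,-1,1)/2, and tabulates all probabilities.

```python
from fractions import Fraction as F

def apply(effect, state):
    """Probability e(w) as the Euclidean inner product."""
    return sum(a * b for a, b in zip(effect, state))

def add(a, b):
    return tuple(x + y for x, y in zip(a, b))

half = F(1, 2)
states = {1: (1, 0, 1), 2: (0, 1, 1), 3: (-1, 0, 1), 4: (0, -1, 1)}
effects = {1: (half, half, half), 2: (-half, half, half),
           3: (-half, -half, half), 4: (half, -half, half)}
u = (0, 0, 1)

# table[i, j] is the probability of outcome e_i on the pure state w_j.
table = {(i, j): apply(effects[i], states[j])
         for i in effects for j in states}

# Fiducial probabilities (p_1, p_2) of the first outcomes of M_1 and M_2.
fiducial = {j: (table[1, j], table[2, j]) for j in states}
```

The four pure states are mapped to the four corners of the unit square of (p_1, p_2) values, so their convex mixtures fill the whole range, and both e_1 and e_4 are certain on the first pure state.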

Higher dimensional single systems with $d > 2$ in boxworld
have $d$ different binary orthogonal measurements and state
spaces given by hypercubes. For the joint states that we
shall discuss in section VI, boxworld allows maximal nonlocal
correlations (using the CHSH inequality introduced below).
These correlations define the Popescu-Rohrlich box [13], and
they are not realizable by quantum theory.

III. THE NO-RESTRICTION HYPOTHESIS 

We now consider in detail the no-restriction hypothesis, and 
the consequences of relaxing it. 

A. Defining the set of effects 

Effects are restricted to give values in the range of [0, 1] 
when applied to normalized states. But in the traditional
framework of GPTs, the set of effects E is not restricted any
further. That is, the set of effects is exactly the set of all
probability-valued linear functionals on the given states. We
will call this relationship between states and effects the
no-restriction hypothesis, in accordance with [6]. It is satisfied
for classical probability theory and quantum theory. 

Theorem 1. The set of effects under the no-restriction
hypothesis is given by

$$E := V_+^* \cap (u - V_+^*) \tag{15}$$

with the so-called dual cone

$$V_+^* := \{ e \in V^* \mid e(\omega) \geq 0 \;\; \forall \omega \in \Omega \}. \tag{16}$$

Proof. The definition of effects as probability-valued linear
functionals can be decomposed into two conditions.

The first condition is that effects have to give non-negative
results on every element of the state space. For arbitrary
elements $e \in V^*$ satisfying this condition, the condition is also
satisfied by the positive ray $\{\lambda e \mid \lambda \geq 0\}$. Hence, the set
obeying the non-negativity condition is a cone, namely the dual
cone $V_+^*$ defined by (16).

The second condition on effects requires them to give
results not larger than one, when applied to arbitrary normalized
states. In other words the results have to be one or one minus
a positive value, i.e. $e \in u - V_+^*$.

In the standard framework both boundary conditions are
saturated. That is, for a given state space, any linear
functional that gives probability-valued results for all normalized
states is included in the theory. Thus, the set of effects E is
$V_+^* \cap (u - V_+^*)$. □

The dual of the dual cone is the primal cone $V_+$, which is
generated by unnormalized states, i.e.

$$V_+ := \{ \lambda\omega \mid \omega \in \Omega, \lambda \geq 0 \} = (V_+^*)^*. \tag{17}$$

Consequently, if the no-restriction hypothesis holds, then a 
theory is completely determined by the state space, since the 
effect set can be derived from the state space. 
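
For a polytope state space this derivation can be made concrete, because a linear functional only needs to be tested on the pure states. The following sketch assumes the square state space of boxworld and a handful of hypothetical candidate functionals; it compares membership in the intersection of the dual cone with its reflected copy against the direct requirement of probability-valued results. The agreement relies on the unit measure evaluating to one on every normalized state.

```python
from fractions import Fraction as F

def apply(effect, state):
    return sum(a * b for a, b in zip(effect, state))

def in_dual_cone(e, pure_states):
    """e is non-negative on every pure state, hence on the whole state cone."""
    return all(apply(e, w) >= 0 for w in pure_states)

def in_effect_set(e, pure_states, u):
    """e lies in the dual cone and so does its complement u - e."""
    complement = tuple(b - a for a, b in zip(e, u))
    return in_dual_cone(e, pure_states) and in_dual_cone(complement, pure_states)

def is_probability_valued(e, pure_states):
    return all(0 <= apply(e, w) <= 1 for w in pure_states)

square = [(1, 0, 1), (0, 1, 1), (-1, 0, 1), (0, -1, 1)]
u = (0, 0, 1)

# Hypothetical candidate functionals (illustrative values).
candidates = [(F(1, 2), F(1, 2), F(1, 2)),  # extremal boxworld effect
              (1, 0, 0),                     # negative on one pure state
              (0, 0, F(1, 3)),               # rescaled unit measure
              (F(1, 2), 0, F(1, 2)),         # valid, but not extremal
              (0, 0, 2)]                     # exceeds one on every state

results = [(in_effect_set(e, square, u), is_probability_valued(e, square))
           for e in candidates]
```

Both tests agree on every candidate, as expected from the two-condition argument in the proof.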

The purpose of this paper is to develop the framework 
of GPTs without the no-restriction hypothesis. There are 
two main reasons for doing so. Firstly, the necessity of the 
no-restriction hypothesis is questionable from an operational 
perspective. Indeed, considering the physical meaning of 
states and effects there is no reason to believe that the possi- 
ble preparation procedures determine possible measurements. 
Secondly, this will generalize the GPT framework to cover 
new scenarios that have not been accessible within the old 
framework. 



B. Relaxing the no-restriction hypothesis

Let us note the constraints that still apply when the
no-restriction hypothesis is removed. Clearly, effects still need
to give probabilities when applied to any state. That is, when
allowing violations of the no-restriction hypothesis, the set of
probability-valued linear functionals on states in (15) remains
an upper bound for possible effects. However, in general not
all elements in this set need to represent a valid measurement
outcome. Consequently, the set of effects E may actually be
given by a subset of (15). This is the crucial new ingredient in
the GPT framework that we shall use in subsequent sections.

Furthermore, we have identified the following four consis- 
tency conditions that also have to be met: 

i) The unit measure $u$ needs to be included in the restricted
set, as it is crucial for the definition of measurements.

ii) For every effect $e$ included in E, the complement effect
$\bar{e} = u - e$ needs to be included as well. We will show in
section VI 3 that including an effect but not its
complement can yield inconsistencies for joint states.

iii) Coarse graining also provides effects that can be
derived from existing ones. If one does not distinguish
between some measurement outcomes that are part of
the same measurement, the common probabilities are
given by the sum of the individual probabilities. Due to
linearity the corresponding effect describing the coarse
graining is given by the sum of the individual effects.

iv) Transformations map valid states to valid states.
However, for any transformation $T$ on states, there is also
an adjoint transformation $T^\dagger$ on effects defined by
$e[T(\omega)] = [T^\dagger(e)](\omega)$ for all states and effects. Thus,
the effect set has to respect given transformations.
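
The first two conditions are easy to test mechanically for a finite list of generating effects. The sketch below is a hypothetical helper, not part of the framework itself: it checks that the unit measure is present, that every generator has its complement in the list, and that all generators are probability-valued on the pure states. Coarse graining and transformations are not checked here.

```python
from fractions import Fraction as F

def apply(effect, state):
    return sum(a * b for a, b in zip(effect, state))

def complement(e, u):
    return tuple(b - a for a, b in zip(e, u))

def violated_conditions(generators, u, pure_states):
    """List the violated conditions for a finite set of generating effects."""
    known = set(generators)
    problems = []
    if u not in known:
        problems.append("unit measure missing")
    for e in generators:
        if complement(e, u) not in known:
            problems.append(f"complement of {e} missing")
        if not all(0 <= apply(e, w) <= 1 for w in pure_states):
            problems.append(f"{e} is not probability-valued")
    return problems

half = F(1, 2)
square = [(1, 0, 1), (0, 1, 1), (-1, 0, 1), (0, -1, 1)]
zero, u = (0, 0, 0), (0, 0, 1)
e1, e3 = (half, half, half), (-half, -half, half)

# Restricting boxworld to a single binary measurement is consistent ...
consistent = violated_conditions([zero, u, e1, e3], u, square)
# ... whereas dropping the complement of e1 is not.
inconsistent = violated_conditions([zero, u, e1], u, square)
```

The generators stand for the extremal points of a convex effect set; since complementation is affine, closure of the generators under complement carries over to their convex hull.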

Apart from these consistency restrictions, the definition of 
the effect set E is now an independent part of the specification 
of the theory. In other words, the effect set E does not depend 
on the state space now, and the dual cone is irrelevant for 
single systems. However, we will see in section VII B that we
still need it to classify consistent joint states. 

Let us now consider how removing the no-restriction hy- 
pothesis will be useful. As shown above, the no-restriction 
hypothesis connects a set of states and effects via the
respective dual cone. Taking a closer look at the dual cone
construction in (16), it can easily be seen that each extremal point of
the primal cone describes a facet of the dual cone and vice
versa. Therefore, arbitrarily small changes in the primal
cone can have an enormous impact on the form of the dual
cone. Consequently, the no-restriction hypothesis makes it 
extremely difficult to alter a theory in a controlled way. How- 
ever, it has always been a central motivation for the frame- 
work of generalized probabilistic theories to find alternatives 
to quantum theory. 




FIG. 2. Inclusion of noise into boxworld: State space and effects
are both embedded into $\mathbb{R}^3$ and shown from above for illustration.
The state space (blue) is given by a square. The effect set is the
octahedron spanned by the extremal effects $e_i$, $u$ and 0. The noisy
theory has a restricted effect set whose extremal effects are the $e_i$ mixed with $u/2$.



We shall now show in sections IV and V that new models
with interesting features can indeed be constructed when
accepting violations of the no-restriction hypothesis.
Furthermore, for joint systems, we will see in sections VI and VII
how consistency conditions are affected.



IV. THEORIES WITH INTRINSIC NOISE 

The no-restriction hypothesis guarantees that for any pure
state $\omega$, there is an effect $e$, with $e \neq u$, such that $e(\omega) = 1$.
In contrast, removing the no-restriction hypothesis allows for
the modeling of systems with intrinsic noise, i.e. systems for
which the unit measure is the only certain outcome for any
state. For example, an isotropic unbiased implementation of
noise can be achieved by restricting the effects to a set where
the original extremal effects are replaced by mixtures with $u/2$
(except for 0 and $u$ itself). In order to combine noise and bias
one can mix the extremal effects with another effect instead of $u/2$.

The inclusion of intrinsic noise by a modification of
boxworld is illustrated in Fig. 2. The state space of a single
system is given by a square. In the traditional model the effect
set is determined by the no-restriction hypothesis. A noisy
version of boxworld is given by mixing the extremal effects $e_i$
with $u/2$:



(18) 



The strength of noise is given by $(1 - \lambda)$, i.e. the maximal
probability from extremal effects is $\lambda$.
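
The effect of such a mixture on the attainable probabilities can be illustrated numerically. In the sketch below the mixing weight `t` is a hypothetical parameter of this example, and it is not claimed to coincide with the parametrization used in Eq. (18); the point is only that after mixing with u/2 no extremal effect other than the unit measure yields a certain outcome.

```python
from fractions import Fraction as F

def apply(effect, state):
    return sum(a * b for a, b in zip(effect, state))

def mix_with_half_unit(e, u, t):
    """Return t*e + (1 - t)*u/2 for a mixing weight 0 <= t <= 1 (assumed form)."""
    return tuple(t * a + (1 - t) * b / 2 for a, b in zip(e, u))

half = F(1, 2)
states = [(1, 0, 1), (0, 1, 1), (-1, 0, 1), (0, -1, 1)]
effects = [(half, half, half), (-half, half, half),
           (-half, -half, half), (half, -half, half)]
u = (0, 0, 1)

t = F(4, 5)  # hypothetical mixing weight
noisy = [mix_with_half_unit(e, u, t) for e in effects]

probs = [apply(e, w) for e in noisy for w in states]
max_prob, min_prob = max(probs), min(probs)

# Complementary noisy outcomes still sum to the unit measure.
pair_sum = tuple(a + b for a, b in zip(noisy[0], noisy[2]))
```

With this weight the probabilities of the noisy extremal effects are confined to the interval from 1/10 to 9/10, while each binary measurement remains normalized.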

This model is particularly interesting with respect to its po- 
tential non-local correlations in joint systems. This will be 
examined in more detail after introducing joint states in
section VI.



V. SELF-DUALIZATION PROCEDURE 

A particular class of systems that has gained a lot of interest
recently are so-called (strongly) self-dual systems [2, 9, 14].
These are systems with a particular geometrical structure
shared by both classical probability theory and quantum
theory. For strongly self-dual systems states and effects can be
identified with each other and thus be represented by the same
mathematical objects. For example, in quantum theory both states
and effects are represented by positive Hermitian operators.

FIG. 3. Self-dualization of a hexagon system: The pictures show the state space (blue) and the intersection of the effect cone (red) that lies in
the same plane. In the first step the state cone is embedded into the effect cone by an equivalence transformation. In the second step
the effects not included in the state cone are abandoned.

Formally, strong self-duality is given by the following defi- 
nition. 

Definition 2. A system is strongly self-dual iff there exists
an isomorphism $\Phi : V_+^* \to V_+$ giving rise to a corresponding
symmetric bilinear form $T$ with $T(e, f) = e[\Phi(f)] = T(f, e)$
and $T(e, e) \geq 0$ for all $e, f \in V^*$.

That is, $T$ provides a semi inner product on effects. In a
similar way for strongly self-dual systems the inverse map
$\Phi^{-1}$ leads to a semi inner product on states.

Strong self-duality greatly restricts the class of possible sys- 
tems. As we describe below, the property of 'bit-symmetry' 
implies that a system is strongly self-dual [14], and there is
evidence that non-local correlations of self-dual systems are
limited [9]. In this section we provide a general construction rule
to modify any system, such that it resembles the behaviour of 
strongly self-dual systems. 

Theorem 3. Any theory in the GPT framework can be
modified to resemble strongly self-dual systems respecting
Definition 2 with the dual cone replaced by a truncated cone
$\tilde{V}_+^*$.

Proof. Using our representation, we assume an embedding of
effects and states in a common vector space with a scalar
product mediating the application of effects on states, as in Eq. (2).
We start from an arbitrary theory for which the no-restriction
hypothesis holds. The freedom of linear transformations
from (9) allows us to strictly enlarge the effect cone $V_+^*$, while
the corresponding inverse $L^{-1}$ constricts the cone of
unnormalized states $V_+$ to be strictly smaller. Hence, one can
always represent the same physical theory with $V_+$ embedded
in $V_+^*$. We can then define a truncated effect cone
$\tilde{V}_+^* \subset V_+^*$, such that $\tilde{V}_+^*$ coincides with the state cone $V_+$.
Hence we can describe unnormalized effects and states with
the same set of vectors. Consequently, the restriction of
effects allows the vector space's scalar product to act as an inner
product between states. This satisfies the definition of strong
self-duality with the dual cone exchanged for the truncated
effect cone $\tilde{V}_+^*$. The set of effects is then constructed from
it by $E = \tilde{V}_+^* \cap (u - \tilde{V}_+^*)$. □

The connection between self-dualized systems and actual 
strongly self-dual systems is not limited to a mere formal
resemblance. In fact, the following example shows that self- 
dualized systems have features that strongly self-dual systems 
have when the no-restriction hypothesis is assumed. 



A. Example: self-dualized polygons 

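Let us illustrate the self-dualization procedure on a set of
systems introduced in a previous paper [9]. They are defined by
two-dimensional state spaces with the shape of regular
polygons. While the cases with an odd number of vertices $n$ are
strongly self-dual, the even cases are not.

For fixed $n$, let $\Omega$ be the convex hull of $n$ pure states $\{\omega_i\}$,
$i = 1, \dots, n$, with

$$\omega_i = \begin{pmatrix} r_n \cos\left(\frac{2\pi i}{n}\right) \\ r_n \sin\left(\frac{2\pi i}{n}\right) \\ 1 \end{pmatrix} \in \mathbb{R}^3, \tag{19}$$

where $r_n = \sqrt{\sec(\pi/n)}$.
The unit effect is

$$u = (0, 0, 1)^T. \tag{20}$$

The set $E(\Omega)$ of all possible measurement outcomes will
be determined by the no-restriction hypothesis. In the case of
even $n$, $E(\Omega)$ is the convex hull of the zero effect, the unit
effect, and $e_1, \dots, e_n$, with

$$e_i = \frac{1}{2} \begin{pmatrix} r_n \cos\left(\frac{(2i-1)\pi}{n}\right) \\ r_n \sin\left(\frac{(2i-1)\pi}{n}\right) \\ 1 \end{pmatrix}. \tag{21}$$

The odd case yields a different expression for the
ray-extremal effects:

$$e_i = \frac{1}{1 + r_n^2} \begin{pmatrix} r_n \cos\left(\frac{2\pi i}{n}\right) \\ r_n \sin\left(\frac{2\pi i}{n}\right) \\ 1 \end{pmatrix}. \tag{22}$$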
As shown in III the complement effects $\bar{e}_i = u - e_i$ of
ray-extremal effects $e_i$ are also extremal in the effect set $E(\Omega)$.
Whereas for even $n$ these happen to coincide with
$\bar{e}_i = e_{(i+n/2) \bmod n}$, for odd $n$ the complement effects form additional
extremal points of $E(\Omega)$. In summary, $E(\Omega)$ is the convex hull
of the zero effect, the unit effect $u$, the ray-extremal effects
$e_1, \dots, e_n$, and for odd $n$ additionally $\bar{e}_1, \dots, \bar{e}_n$.

In the limit $n \to \infty$ both cases converge to a disc that can be
regarded as the 2D subspace of a qubit. The extremal rays of
the dual cone of polygon systems with an odd number of vertices
coincide with the scaled extremal states, i.e. these systems
are strongly self-dual. However, for polygon systems with an
even number of vertices the primal and dual cones are only
isomorphic and can be matched by a rotation of $\pi/n$. That is,
the even polygons are not strongly self-dual in the original
models. We will now self-dualize these even-polygon systems
using the procedure described in Theorem 3.
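
The polygon models can be explored numerically. The sketch below assumes pure states at angles 2πi/n with radius r_n = sqrt(sec(π/n)), ray-extremal effects at angles (2i-1)π/n with prefactor 1/2 for even n, and effects parallel to the states with prefactor 1/(1 + r_n²) for odd n. The prefactors are the normalizations for which the maximal probability equals one and should be read as assumptions of this example; floating-point arithmetic is used, so comparisons are made up to a tolerance.

```python
import math

def radius(n):
    """r_n = sqrt(sec(pi/n)), defined for n >= 3."""
    return math.sqrt(1.0 / math.cos(math.pi / n))

def polygon_states(n):
    r = radius(n)
    return [(r * math.cos(2 * math.pi * i / n),
             r * math.sin(2 * math.pi * i / n), 1.0)
            for i in range(1, n + 1)]

def polygon_effects(n):
    r = radius(n)
    if n % 2 == 0:
        # Even n: effect directions are rotated by pi/n relative to the states.
        scale = 0.5
        angles = [(2 * i - 1) * math.pi / n for i in range(1, n + 1)]
    else:
        # Odd n: effects are rescaled copies of the pure states.
        scale = 1.0 / (1.0 + r * r)
        angles = [2 * math.pi * i / n for i in range(1, n + 1)]
    return [(scale * r * math.cos(a), scale * r * math.sin(a), scale)
            for a in angles]

def probability_range(n):
    """Smallest and largest probability of a ray-extremal effect on a pure state."""
    probs = [sum(a * b for a, b in zip(e, w))
             for e in polygon_effects(n) for w in polygon_states(n)]
    return min(probs), max(probs)

ranges = {n: probability_range(n) for n in range(3, 9)}
```

For every n in the tested range the probabilities span exactly the interval from zero to one, and for odd n each effect is proportional to a pure state, which is the geometric content of strong self-duality in these models.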



As discussed in section II 3 there is always the freedom to apply arbitrary bijective linear maps to all effects and the corresponding inverse map to all states. We use this to shrink the state space by $r_n \to 1$ so that it fits in a circumscribed circle of radius one. Applying the inverse map to effects results in an effect cone with $r_n \to r_n^2$. This new effect cone is strictly bigger than the cone of unnormalized states. By truncating this effect cone, such that the new extremal effects $e'_i$ are given by

\[ e'_i = \frac{e_i + e_{(i+1) \bmod n}}{2}, \tag{23} \]

the primal cone coincides with the new effect cone generated by the restricted effect set.
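
The truncation step can be sketched numerically as follows, assuming the rescaled representation described above (states on the unit circle, effects of Eq. (21) with $r_n$ replaced by $r_n^2$). The function name is a hypothetical choice for illustration. The sketch returns the rescaled states, the rescaled unrestricted effects and the truncated effects $e'_i$, so that one can compare $e'_i$ with $\omega_i/2$.

```python
import numpy as np

def self_dualized_polygon(n):
    """Rescaled states, rescaled effects and truncated effects for even n."""
    if n % 2 != 0:
        raise ValueError("the truncation is only applied to even polygons")
    i = np.arange(1, n + 1)
    r2 = 1.0 / np.cos(np.pi / n)  # r_n^2 = sec(pi/n)
    # States shrunk onto the unit circle (r_n -> 1).
    ang_s = 2 * np.pi * i / n
    states = np.stack([np.cos(ang_s), np.sin(ang_s), np.ones(n)], axis=1)
    # Effects after the inverse map (r_n -> r_n^2), cf. Eq. (21).
    ang_e = (2 * i - 1) * np.pi / n
    effects = 0.5 * np.stack([r2 * np.cos(ang_e), r2 * np.sin(ang_e), np.ones(n)], axis=1)
    # Truncation: e'_i = (e_i + e_{(i+1) mod n}) / 2.
    restricted = 0.5 * (effects + np.roll(effects, -1, axis=0))
    return states, effects, restricted
```

In this representation the truncated effects are proportional to the pure states, which is the sense in which the restricted effect cone coincides with the primal cone.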

Let us demonstrate the self-dualization procedure explic- 
itly, by using the polygon with n = 4 (this is the boxworld 
model). In the first step the pure states and effects are trans- 
formed to the equivalent representation given in ( [TT] i and ( pj) . 
In this representation the cone of unnormalized states is completely embedded in the effect cone. The actual self-dualization is then done by exchanging $e_i$ for $e'_i = \omega_i/2$, discarding the effects not included in the primal cone.

For all self-dualized polygon models, another interesting feature emerges for the restricted case. Namely, for each pure state $\omega$ there exists a specific pure state $\bar{\omega}$ such that they can be perfectly distinguished by an effect $e$ with $e(\omega) = 1$ and $e(\bar{\omega}) = 0$. Furthermore, each pair of perfectly distinguishable states can be mapped reversibly to any other pair of perfectly distinguishable states. This feature is known as bit symmetry, and was shown to only hold for strongly self-dual systems in the traditional framework [14].

This demonstrates that the self-dualization procedure can 
actually reproduce properties thought to be specific for actual 
strongly self-dual systems. Note that the mathematical de- 
scription of actual strongly self-dual systems can be complex. 
Using self-dualized systems might be an alternative that helps 
to identify new features of strongly self-dual systems, even 
if one is not interested in the relaxation of the no-restriction 
hypothesis. 



B. Spekkens's toy theory 

In [15] Spekkens introduced a toy theory which replicates many features of quantum theory. For example, it exhibits a no-cloning theorem and a teleportation protocol. The theory is not explicitly probabilistic, since outcomes are not explicitly assigned probabilities. Instead, a graphical calculus is used. Given a state $\omega$, the outcome $i$ is only specified to be 'possible' or 'impossible'. The Spekkens theory in its original form also has no notion of arbitrary convex mixing, i.e. it does not have the property that for any pair of states $\omega_1$ and $\omega_2$ there exists a state $p\,\omega_1 + (1 - p)\,\omega_2$ for all probabilities $p \in [0, 1]$.

The ability to form convex mixtures is crucial to GPTs, and in particular to their operational motivation. Fortunately, there is a natural extension of the Spekkens theory which is probabilistic and which does allow convex mixing (the probabilistic version of this theory was also introduced previously by Hardy in [10]). The state space $\Omega$ of a single system is then the octahedron. In the representation that we have used, the six extremal states (i.e. the pure states) are just given by the co-ordinates of the vertices of the octahedron in $\mathbb{R}^3$, with an extra component for normalization. For example, the four extremal states that form the square base of each pyramid are identical to the states for boxworld (see Fig. 4). That is, for $i = 1, \dots, 4$ the states are:

\[ \omega_i = \begin{pmatrix} \cos(2\pi i/4) \\ \sin(2\pi i/4) \\ 0 \\ 1 \end{pmatrix}, \tag{24} \]

and for $i = 5, 6$ the states are

\[ \omega_{5,6} = \begin{pmatrix} 0 \\ 0 \\ \pm 1 \\ 1 \end{pmatrix}. \tag{25} \]

FIG. 4. The state space of the Spekkens model, with the six pure states $\omega_i$ labelled.

Now, the dual space of an octahedron is the cube. However, in the Spekkens theory, the space of effects is identical to the state space: it is also the octahedron depicted in Fig. 4. Since the octahedron can be obtained by restricting the cube (in the same way that is depicted for the hexagon in Fig. 3), we see that the Spekkens theory provides an example of a self-dualized theory. In particular, the convex probabilistic version of it is obtained using the self-dualization procedure defined in Theorem 3 and as described above for self-dualized polygons. Indeed, as with boxworld, the restricted effects are given by:

\[ e'_i = \frac{\omega_i}{2}. \]



Hence we see that, at least for single systems, the Spekkens 
theory can be seen as an extension of self-dualized boxworld: 
the state and effect space of the Spekkens theory contain the 
state and effect space respectively of self-dualized boxworld. 
We develop the analysis of joint systems for the Spekkens theory in Section VII C 3.



We note that the single-system state space is identical to 
that of stabilizer quantum mechanics, for which the only al- 
lowed states are the eigenstates of the Pauli operators, and the 
allowed transformations are the Clifford operations. As discussed in [15] and further in [16], the Spekkens theory and
stabilizer quantum mechanics differ in the group of reversible 
transformations that each theory specifies. 



VI. JOINT SYSTEMS IN THE TRADITIONAL GPT 
FRAMEWORK 

In the preceding sections we have not distinguished be- 
tween single systems and joint systems. That is, our discus- 
sion so far (e.g. of self-dualization) has not involved any po- 
tential subsystem structure, whereby a system C can be di- 
vided into subsystems A and B, with each subsystem hav- 
ing well-defined states and effects. In the next section we 
shall consider how relaxing the no-restriction hypothesis af- 
fects composite systems. Before doing so, in this section we 
recall the treatment of joint systems in the traditional frame- 
work, i.e. when the no-restriction hypothesis is assumed to 
hold. 

We will restrict the discussion of joint systems to the bipartite case with two subsystems, as the generalization to multipartite systems is straightforward. Bipartite joint states are given by elements of the product space

\[ V^{AB} = V^A \otimes V^B, \tag{26} \]

and joint effects are elements of $V^{AB*} = V^{A*} \otimes V^{B*}$, respectively [17].

We will represent joint states and joint effects by $n \times m$ matrices, with $n = \dim V^A = \dim V^{A*}$, $m = \dim V^B = \dim V^{B*}$. As for single systems, the application of effects on states results in the sum of the entry-wise products. This can be elegantly written as the Hilbert-Schmidt inner product

\[ e^{AB}[\omega^{AB}] = \sum_{i,j} e_{ij}\, \omega_{ij} = \mathrm{Tr}(e^T \cdot \omega), \tag{27} \]

where we write $e^T \cdot \omega$ for the matrix product between the transpose of the matrix $e$ representing the joint effect $e^{AB}$ and the matrix $\omega$ representing the joint state $\omega^{AB}$.
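
The equivalence of the two forms in (27) can be illustrated with a short sketch. The helper names are illustrative assumptions; the matrices used in a check may be arbitrary, since only the algebraic identity is being shown, not a particular theory.

```python
import numpy as np

def apply_joint_effect(e, w):
    """Sum of entry-wise products of the effect matrix e and the state matrix w."""
    return float(np.sum(e * w))

def apply_joint_effect_trace(e, w):
    """The same number written as the Hilbert-Schmidt inner product Tr(e^T w)."""
    return float(np.trace(e.T @ w))

def product_matrix(a, b):
    """Matrix representing the product a (x) b of two local vectors."""
    return np.outer(a, b)
```

For product effects and product states, represented as outer products via `product_matrix`, the value factorizes into the product of the two local probabilities.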

To define a composite system for a particular GPT (with specified state and effect spaces for individual systems), we must define the set of joint states $\Omega^{AB} = \{\omega^{AB}\}$, and the set of joint effects $E^{AB} = \{e^{AB}\}$, such that these are consistent with the individual systems. If the no-restriction hypothesis holds, then, as before, once the set of joint states $\Omega^{AB}$ is defined, the set of effects $E^{AB}$ is determined. In this situation we need only consider the definition of $\Omega^{AB}$ in order to specify the behaviour of composite systems. There is much freedom in defining $\Omega^{AB}$, but there are two boundary cases which we now discuss.



1. Lower bound on joint systems 



Consider independently prepared systems A and B with states $\omega^A \in \Omega^A$, $\omega^B \in \Omega^B$. Treating the systems jointly as a composite AB, the overall preparation is represented by the product state $\omega^{AB} = \omega^A \otimes \omega^B$, with $\omega^{AB} \in V^{AB}$. However, just as classical mixtures are allowed for single systems, for joint systems mixtures of product states again give valid joint states. This corresponds to the ability of experimenters to classically correlate the preparations and measurements of the individual systems, e.g. two experimenters can agree on specific settings.

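The set of unnormalized states only containing product states and their mixtures is known as the minimal tensor product $A_+ \otimes_{\min} B_+$.

Definition 4. The minimal tensor product is given by

\[ A_+ \otimes_{\min} B_+ := \Big\{ \omega^{AB} \in V^{AB} \;\Big|\; \omega^{AB} = \sum_i \lambda_i\, \omega^A_i \otimes \omega^B_i, \;\; \omega^A_i \in A_+,\; \omega^B_i \in B_+,\; \lambda_i \geq 0 \Big\}. \tag{28} \]

It is the smallest possible set of unnormalized joint states $\omega^{AB}$ that is compatible with given state cones $A_+ \subseteq V^A$, $B_+ \subseteq V^B$ of subsystems A, B.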

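Similar reasoning applies to measurements, and so the set of joint effects is lower-bounded by the convex hull of product effects. Importantly, this includes the joint unit measure $u^{AB} = u^A \otimes u^B$, which is uniquely defined due to the equivalence principle. Hence, normalization of joint states $\omega^{AB}$ is represented by the condition $u^{AB}(\omega^{AB}) = 1$. This allows us to define the bipartite state space $\Omega^{AB}_{\min}$, corresponding to the minimal tensor product:

\[ \Omega^{AB}_{\min} := \{ \omega^{AB} \in A_+ \otimes_{\min} B_+ \mid u^{AB}(\omega^{AB}) = 1 \} \tag{29} \]
\[ \phantom{\Omega^{AB}_{\min}} = \Big\{ \sum_i p_i\, \omega^A_i \otimes \omega^B_i \;\Big|\; \omega^A_i \in \Omega^A,\; \omega^B_i \in \Omega^B,\; p_i \geq 0,\; \sum_i p_i = 1 \Big\}. \tag{30} \]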



For classical subsystems (i.e. a simplex), the joint states and effects defined by the minimal tensor product are sufficient to
describe joint classical systems. Theories with non-classical 
subsystems, however, allow joint states that cannot be inter- 
preted as a mixture of product states, i.e. entangled states. The 
other extreme to the minimal tensor product allows all possi- 
ble entangled states, as we now show. 
2. Upper bound on joint systems

Everything introduced so far is valid independent of the no-restriction hypothesis. This changes now, as we ask for the maximal sets of joint states and effects consistent with the structure of the single systems.

First, let us focus on the traditional GPT framework with single systems obeying the no-restriction hypothesis. Given a specific state space, the no-restriction hypothesis determines the effects for the single systems. As argued above, the joint system should at least incorporate product effects and their mixtures. Applying such joint effects to any potential joint state $\omega^{AB}$ should give probabilities. In particular this implies that the joint states form a subset of the following set of linear elements.

Definition 5. The maximal tensor product is defined as

\[ A_+ \otimes_{\max} B_+ := \{ \omega^{AB} \in V^{AB} \mid (e^A \otimes e^B)[\omega^{AB}] \geq 0 \;\; \forall e^A \in E^A,\; e^B \in E^B \} \tag{31} \]
\[ \phantom{A_+ \otimes_{\max} B_+} = (A^*_+ \otimes_{\min} B^*_+)^*. \tag{32} \]

It is the largest possible set of unnormalized joint states $\omega^{AB}$ that is compatible with given state cones $A_+$, $B_+$ of subsystems A, B that respect the no-restriction hypothesis.

Note that the second equality arises just by definition of the dual cone (16). Hence, we see that the maximal tensor product for states is given by the maximal set of joint states consistent with the minimal tensor product for effects. Similarly the maximal tensor product for effects is defined as the maximal set of joint effects consistent with the minimal tensor product for states. Elements in the maximal tensor product, but not in the minimal tensor product, are called entangled.

To summarise our constructions in this section: the definition of a GPT includes the tensor product, which specifies the composition of subsystems. The minimal and maximal tensor products are only the extreme cases where the joint state space $\Omega^{AB}$ is chosen as the smallest or the biggest set compatible with the state spaces $\Omega^A$, $\Omega^B$ of single systems. In general, a GPT can be defined to include any set of joint states between those extremes.

For example, the joint state space in quantum theory lies strictly between the minimal and maximal tensor product: the partial transposes of density matrices representing entangled states of two qubits, or of a qubit and a qutrit, are known to give invalid states for the quantum tensor product, because they are not positive on all entangled effects [18]. However, these states give positive results for separable measurements, i.e. they are in the maximal tensor product. Note that these states should not be misunderstood as part of quantum theory, but form a separate toy theory that omits any entangled measurements. Nevertheless, the additional states in the maximal tensor product of local quantum systems are useful for the study of entanglement in standard quantum theory, as they correspond exactly to the set of entanglement witnesses.



3. Joint states as linear maps

For our generalization of the maximal tensor product, we shall use the following conception of joint states. Joint states can linearly map effects from one part of the joint system to unnormalized states of the other subsystem. This can be conveniently shown in the representation of joint states as matrices, since

\[ (e^A \otimes e^B)[\omega^{AB}] = \mathrm{Tr}\big[(e^A \otimes e^B)^T \cdot \omega^{AB}\big] = (e^A)^T \cdot \omega^{AB} \cdot e^B. \tag{33} \]

Using associativity of the matrix product, we can interpret parts of the expression $(e^A)^T \cdot \omega^{AB} \cdot e^B$ as 'effective' states of the subsystems A and B. We define these conditional states as

\[ \tilde{\omega}^A_{e^B} := \omega^{AB} \cdot e^B, \tag{34} \]
\[ \tilde{\omega}^B_{e^A} := (e^A)^T \cdot \omega^{AB}. \tag{35} \]



These are unnormalized states for system A and B respectively. Physically, these can be regarded as 'post-measurement' states on one part of the joint system, conditioned on a particular measurement outcome on the other part. This process of remotely preparing a state by a measurement on the other part of a joint state is usually referred to as 'steering' [19]. It demonstrates that, when measuring only part of a joint system, the joint state acts as a linear map from effects of one side of the system to unnormalized states of the other part. It can be shown that the maximal tensor product coincides exactly with all possible linear maps of this form, i.e. it corresponds to all potential joint states that have valid conditional states for non-restricted systems [2]. This property will be central for the generalization of the maximal tensor product in the next section.

Conditional states at A are unnormalized: they are weighted with the probability of obtaining the corresponding measurement outcome at B. That is, the probability accounts for the potential ignorance of the outcome for observers at B. Consequently, if one knows the measurement outcome in B the effective description of the state in A is given by the normalized conditional state:

\[ \omega^A_{e^B} := \frac{\tilde{\omega}^A_{e^B}}{u^A(\tilde{\omega}^A_{e^B})}. \]

The marginal state or reduced state $\omega^A_{u^B}$ gives the description of the effective state on part A of a joint state $\omega^{AB}$. This is a conditional state with $e^B = u^B$, and is already normalized, i.e.

\[ \omega^A_{u^B} = \tilde{\omega}^A_{u^B} = \omega^{AB} \cdot u^B. \]



Note that this formalism still applies if the parts of the system are space-like separated, i.e. if there is no causal relationship between the measurement on the system B and the system A. However, the no-signaling principle states that steering cannot be used to transmit information, i.e. it does not allow for communication faster than the speed of light. The relationship between steering and the no-signaling principle is shown by the following theorem. First, we call a set of effects $\{e^A_i\}_i$, for any system A, a perfect measurement if

\[ \sum_i e^A_i = u^A. \]

An imperfect measurement is a set of effects $\{e^A_i\}_i$ that is not a perfect measurement.

Theorem 6. Assuming the no-signalling principle, steering implies that all measurements are perfect measurements.

Proof. Consider two observers in part A and B respectively sharing a joint state $\omega^{AB}$. The observer in B performs a measurement on his part and gets some measurement outcome $e^B_j$. Knowing the outcome, the description of the system in A from his point of view is given by the normalized conditional state $\omega^A_{e^B_j}$. The other observer 'knows' only the coarse graining of the different measurement outcomes, i.e. from his point of view the state in A is an ensemble of possible 'post-measurement states' $\{\omega^A_{e^B_j}\}$.

Remember that the equivalence principle gives a one-to-one correspondence of states and specific measurement statistics. Consequently, no-signaling requires the state in A after the measurement on B to be identical to the original marginal state $\omega^A_{u^B}$ in order to prevent information transfer, i.e.

\[ \omega^A_{u^B} = \sum_j u^A(\tilde{\omega}^A_{e^B_j})\, \omega^A_{e^B_j} = \sum_j \tilde{\omega}^A_{e^B_j} = \omega^{AB} \cdot \sum_j e^B_j, \tag{36} \]

where we used the definition of the normalized conditional state and the linearity of effects.

Since the coarse grained conditional state needs to be equal to the marginal state $\omega^{AB} \cdot u^B$ for any joint state, it follows that

\[ \sum_j e^B_j = u^B. \tag{37} \]

□
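
The role of linearity in this proof can be illustrated with a small numerical sketch. As an assumption for illustration only, we use a boxworld representation with states $(\pm 1, 0, 1)^T$, $(0, \pm 1, 1)^T$, effects $\frac{1}{2}(\pm 1, \pm 1, 1)^T$, and the maximally entangled boxworld state $\Phi^{AB}$ discussed in Section VII C; the variable and function names are hypothetical. The sketch computes the unnormalized conditional states of Eq. (34) and their coarse graining over a set of effects on B.

```python
import numpy as np

# Assumed boxworld representation: rows are omega_1..omega_4 and e_1..e_4.
W = np.array([[1, 0, 1], [0, 1, 1], [-1, 0, 1], [0, -1, 1]], dtype=float)
E = np.array([[1, 1, 1], [-1, 1, 1], [-1, -1, 1], [1, -1, 1]], dtype=float) / 2
U = np.array([0.0, 0.0, 1.0])

# Maximally entangled boxworld state as a 3x3 matrix.
PHI = 0.5 * (np.outer(W[0], W[1]) - np.outer(W[1], W[1])
             + np.outer(W[1], W[2]) + np.outer(W[2], W[0]))

def conditional_state_A(joint, effect_B):
    """Unnormalized conditional state on A for outcome effect_B, cf. Eq. (34)."""
    return joint @ effect_B

def coarse_grained_state_A(joint, effects_B):
    """Sum of the unnormalized conditional states over a set of effects on B."""
    return sum(conditional_state_A(joint, e) for e in effects_B)

MARGINAL_A = conditional_state_A(PHI, U)   # conditional state for the unit effect
PERFECT_0 = (E[0], E[2])                   # e_1 + e_3 = u
PERFECT_1 = (E[1], E[3])                   # e_2 + e_4 = u
IMPERFECT = (E[0], E[1])                   # does not sum to u
```

Coarse graining over either perfect measurement reproduces the marginal state, whereas the set `IMPERFECT` does not, which is the content of Eqs. (36) and (37) in this example.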



We will use the interpretation of the maximal tensor product 
as the set of all positive linear maps to generalize it for systems 
violating the no-restriction hypothesis. 



VII. THE GENERALIZED MAXIMAL TENSOR PRODUCT 

As we have discussed, by removing the no-restriction hy- 
pothesis, the definition of a physical system now needs a spec- 
ification of both the state space and the effect set. That is, 
the set of allowed states and the set of allowed effects can be 
chosen independently, except for the constraints discussed in Section III. Let us now consider the specification of joint systems when the no-restriction hypothesis is removed.

The definition of the minimal tensor product $A_+ \otimes_{\min} B_+$ makes no reference to the effect sets $E^A$ and $E^B$, i.e. it is constructed from products and their convex combinations. Therefore the minimal tensor product can be defined without assuming the no-restriction hypothesis, and hence carries over
to our more general situation. Indeed, everything that we have 
introduced for joint systems so far is valid independently of 
the no-restriction hypothesis — with one exception. 



The exception is the maximal tensor product. As before, 
we expect the maximal tensor product to comprise all joint 
states that are compatible with the given subsystems. Com- 
patibility can be broken down to two requirements: i) non- 
negative results on local effects ii) valid conditional states. For 
non-restricted systems both requirements are equivalent, as 
the no-restriction hypothesis implies consistent mappings (i.e. 
valid conditional states) if and only if local effects give non- 
negative results on joint states. Now, for the general case (i.e. 
without the no-restriction hypothesis), valid conditional states 
still guarantee non-negativity on local effects. However, the
implication in the other direction is no longer secured. 

For example, consider attempting to use the same construc- 
tion as before, i.e. we start with the minimal tensor product 
of effects and determine all elements of the joint system that 
give positive results. The resulting elements do not depend 
on the state spaces of the single systems at all, since the ef- 
fects are decoupled from the state space due to the abandoned 
no-restriction hypothesis. Hence the resulting joint states are 
not forced to be consistent with the subsystems: we give an 
example of such a failure of consistency below. 



A. Failure of the traditional maximal tensor product 

Before generalizing the maximal tensor product we will 
show that the traditional construction rules fail for restricted 
systems. 

The traditional maximal tensor product $A_+ \otimes_{\max} B_+$ is given by the dual of the set of separable effects. For restricted systems this yields two different variants. Equation (31) seems to suggest a construction based on the restricted effects, whereas (32) utilizes the subsystems' dual cones, which are generated by the potential set of unrestricted effects. We show that neither choice gives the set of all joint states consistent with restricted subsystems.

The first variant is constructed as follows. Consider the restricted effects $E^A$ of a subsystem A with an effect cone $E^A_+ := \{\lambda e^A \mid e^A \in E^A, \lambda \geq 0\}$. Using the dual cone, we can construct a virtual, non-restricted system $\mathcal{A}$ with the state cone given by

\[ \mathcal{A}_+ := \{ \omega^A \in V^A \mid e^A(\omega^A) \geq 0 \;\; \forall e^A \in E^A \} \supseteq A_+, \tag{38} \]
\[ \mathcal{A}^*_+ = E^A_+, \tag{39} \]

i.e. the virtual system extends the unnormalized states, such that the no-restriction hypothesis is satisfied. Thus, the potential joint states from (31) actually correspond to the traditional maximal tensor product $\mathcal{A}_+ \otimes_{\max} \mathcal{B}_+$ of the virtual systems $\mathcal{A}$, $\mathcal{B}$.

Recalling the interpretation of joint states as positive linear maps, $\mathcal{A}_+ \otimes_{\max} \mathcal{B}_+$ is exactly the set of all maps from the restricted effect cones ($E^A_+$, $E^B_+$) to the unnormalized virtual states ($\mathcal{B}_+$, $\mathcal{A}_+$) on the other side of the bipartite system. In other words, this construction includes joint states that allow the preparation of states in the subsystems that are not limited to the initial definition of the state spaces $\Omega^A$, $\Omega^B$, but to those of the virtual systems instead.
For example, in a bipartite system of self-dualized boxworld with extremal states according to (11) and restricted extremal effects $e'_i = \omega_i/2$, the potential joint state



CO' 



AB 




(40) 



gives positive values on any pair of restricted effects. However, some conditional states are not valid for the actual system A, e.g. the conditional state $\tilde{\omega}^A = (-1, 1, 1)^T \notin \Omega^A$.

The second variant of the traditional maximal tensor product is based on the dual cones $A^*_+$, $B^*_+$ according to (32). The resulting joint states are also consistent with the restricted effects, since the latter are included in the set of all effects. However, this construction omits joint states which are consistent only with the restricted effects. For example, for self-dualized boxworld the identity matrix would not be included, although it has valid conditional states and gives positive results on any pair of restricted effects.



B. Construction of the generalized maximal tensor product 

As shown above, the traditional construction rules for the maximal tensor product lead to inconsistencies when applied to theories not obeying the no-restriction hypothesis. In this section we shall construct a generalized maximal tensor product $A_+ \bar{\otimes}_{\max} B_+$: this will give the maximal set of joint states that is consistent with general subsystems, irrespective of whether the no-restriction hypothesis is assumed to hold. In other words, the generalized maximal tensor product contains all bipartite states whose conditional (i.e. also marginal) states are elements of the original state spaces.

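Definition 7. The generalized maximal tensor product of systems A, B with primal cones $A_+$, $B_+$, dual cones $A^*_+$, $B^*_+$ and effect cones $E^A_+$, $E^B_+$ is given by

\[ A_+ \bar{\otimes}_{\max} B_+ := (E^A_+ \otimes_{\min} B^*_+)^* \cap (A^*_+ \otimes_{\min} E^B_+)^* = \big( E^A_+ \otimes_{\min} B^*_+ \;\cup\; A^*_+ \otimes_{\min} E^B_+ \big)^*. \tag{41} \]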
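
Definition 7 can be turned into a simple membership test when all cones are polyhedral, since positivity on a minimal tensor product only needs to be checked on products of generating rays. The sketch below does this for self-dualized boxworld. As assumptions for illustration, the states are taken as $(\pm 1, 0, 1)^T$, $(0, \pm 1, 1)^T$, the restricted effects as $\omega_i/2$, and the dual-cone rays as $\frac{1}{2}(\pm 1, \pm 1, 1)^T$; the matrix `BAD` is a hypothetical example, not the state of Eq. (40).

```python
import numpy as np
from itertools import product

# Assumed representation of self-dualized boxworld (n = 4).
STATES = np.array([[1, 0, 1], [0, 1, 1], [-1, 0, 1], [0, -1, 1]], dtype=float)
RESTRICTED = STATES / 2.0  # generating rays of the effect cones E_+
DUAL = np.array([[1, 1, 1], [-1, 1, 1], [-1, -1, 1], [1, -1, 1]], dtype=float) / 2.0  # rays of A_+^* = B_+^*

def nonneg_on(w, rays_a, rays_b, tol=1e-12):
    """True iff a^T w b >= 0 for all pairs of generating rays (a, b)."""
    return all(a @ w @ b >= -tol for a, b in product(rays_a, rays_b))

def in_generalized_max(w):
    """Membership in (E_+ (x)min B_+^*)^* intersected with (A_+^* (x)min E_+)^*, Eq. (41)."""
    return nonneg_on(w, RESTRICTED, DUAL) and nonneg_on(w, DUAL, RESTRICTED)

def in_variant_restricted(w):
    """First traditional variant: positivity on pairs of restricted effects, Eq. (31)."""
    return nonneg_on(w, RESTRICTED, RESTRICTED)

def in_variant_dual(w):
    """Second traditional variant: positivity on pairs of dual-cone elements, Eq. (32)."""
    return nonneg_on(w, DUAL, DUAL)

IDENTITY = np.eye(3)
# Hypothetical joint matrix whose conditional states on A are proportional to (-1, 1, 1)^T.
BAD = np.outer([-1.0, 1.0, 1.0], [0.0, 0.0, 1.0])
```

Under these assumptions the identity matrix passes `in_generalized_max` but fails `in_variant_dual`, while `BAD` passes `in_variant_restricted` but fails `in_generalized_max`, mirroring the two failure modes described in Section VII A.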



For the saturated case, dual cones and effect cones are identical, and we recover the usual maximal tensor product as follows.

Proposition 8. Suppose that $E^A_+ = A^*_+$ and $E^B_+ = B^*_+$. Then $A_+ \bar{\otimes}_{\max} B_+ = A_+ \otimes_{\max} B_+$.

Proof. Under the assumptions, Eq. (41) becomes

\[ A_+ \bar{\otimes}_{\max} B_+ = (A^*_+ \otimes_{\min} B^*_+)^* = A_+ \otimes_{\max} B_+, \]

using the definition of the maximal tensor product in (31). □

Hence our construction is indeed a generalization of the existing definition of the maximal tensor product. It determines all joint states consistent with general subsystems, regardless of whether the no-restriction hypothesis holds or not. That is, all joint states with valid conditional states are included, as shown in the following theorem.

Theorem 9. Let $\omega^{AB} \in V^{AB}$. Then $\omega^{AB} \in A_+ \bar{\otimes}_{\max} B_+$ if and only if $\omega^{AB}$ has well-defined conditional states:

\[ \tilde{\omega}^A_{e^B} \in A_+ \quad \text{and} \quad \tilde{\omega}^B_{e^A} \in B_+ \]

for all $e^A \in E^A$ and $e^B \in E^B$.

Proof. We shall show that

\[ \tilde{\omega}^B_{e^A} \in B_+ \;\; \forall e^A \in E^A \quad \text{iff} \quad \omega^{AB} \in (E^A_+ \otimes_{\min} B^*_+)^* \tag{42} \]

and that

\[ \tilde{\omega}^A_{e^B} \in A_+ \;\; \forall e^B \in E^B \quad \text{iff} \quad \omega^{AB} \in (A^*_+ \otimes_{\min} E^B_+)^*. \tag{43} \]

Since $A_+ \bar{\otimes}_{\max} B_+$ is defined as the intersection of the sets of linear maps $(E^A_+ \otimes_{\min} B^*_+)^*$ and $(A^*_+ \otimes_{\min} E^B_+)^*$, this will establish the thesis.

First we show the $A \to B$ direction, i.e. (42), which is the statement that $(E^A_+ \otimes_{\min} B^*_+)^*$ is the set of all and only those joint states $\omega^{AB}$ such that each $\omega^{AB}$ defines a map from effects $e^A \in E^A$ on system A to valid unnormalized states $\tilde{\omega}^B \in B_+$.

Recall that for non-restricted systems the traditional maximal tensor product is already known to give all positive linear maps from the effect cone of one part of the system to the state cone of the other part, for both directions. We now show that $(E^A_+ \otimes_{\min} B^*_+)^*$ can be interpreted as the traditional maximal tensor product $\mathcal{A}_+ \otimes_{\max} \mathcal{B}_+$ of two virtual systems $\mathcal{A}$ and $\mathcal{B}$ that obey the no-restriction hypothesis. $\mathcal{A}$ is the virtual system that has already been introduced in (38). It has an extended virtual state cone, since $\mathcal{A}_+ = (E^A_+)^* \supseteq A_+$. However, the actual effect cone is kept, as it coincides with the dual cone $E^A_+ = \mathcal{A}^*_+$ characterizing unnormalized effects of the non-restricted systems. The opposite situation applies to $\mathcal{B}$. Here, the effect cone $E^B_+$ is extended to the dual cone $B^*_+$, so that $E^B_+ \subseteq B^*_+$, where $B^*_+$ represents the full set of potential unnormalized effects. However, the original state cone $B_+ = \mathcal{B}_+$ is kept. With these conventions

\[ (E^A_+ \otimes_{\min} B^*_+)^* = (\mathcal{A}^*_+ \otimes_{\min} \mathcal{B}^*_+)^* = \mathcal{A}_+ \otimes_{\max} \mathcal{B}_+ \]

follows directly from the definition of the traditional tensor product in (32). Hence for the $A \to B$ direction, this means that $(E^A_+ \otimes_{\min} B^*_+)^*$ contains all positive linear maps from the restricted effects in A to allowed states in B. That is, $\omega^{AB} \in (E^A_+ \otimes_{\min} B^*_+)^*$ is a sufficient and necessary condition for $\tilde{\omega}^B_{e^A} \in B_+$, which proves (42). However, for the traditional maximal tensor product the same joint states also coincide with the positive maps in the other direction $\mathcal{B} \to \mathcal{A}$, potentially including invalid mappings.

By swapping the roles of A and B in the above argument, we similarly obtain that the set $(A^*_+ \otimes_{\min} E^B_+)^*$ includes all linear maps that are consistent for the $B \to A$ direction, but also those which lead to inconsistencies in the opposite $A \to B$ direction. Hence we obtain (43). □
FIG. 5. Illustration of the construction of the generalized maximal tensor product



Theorem 9 shows that the generalized maximal tensor product includes only those joint states that are consistent in both directions (i.e. the intersection of the sets $(E^A_+ \otimes_{\min} B^*_+)^*$ and $(A^*_+ \otimes_{\min} E^B_+)^*$). Note that if $\omega^{AB}$ has well-defined conditional states, then in particular it is locally positive:

\[ (e^A \otimes e^B)[\omega^{AB}] \geq 0, \]

which provides a useful necessary condition that joint states must satisfy.

In the traditional GPT framework the choices of tensor 
products for states and effects are not independent, as the no- 
restriction hypothesis does not only apply to single systems, 
but to the joint system as well. Having the minimal tensor 
product for joint states (effects) does in fact constitute the 
maximal tensor product for the set of joint effects (states). 
This restriction seems inappropriate given that arbitrary sin- 
gle systems can actually be emulated by classical systems 
with constrained measurements [20], whereas entanglement
is a strictly non-classical feature. 

In our modified framework that is also valid for systems 
violating the no-restriction hypothesis, this is no longer the 
case. We have seen that we can generalize the maximal ten- 
sor product, but nevertheless we are not forced to use this for 
states when we choose the minimal tensor product for effects,
and the other way round. 



C. Examples of joint systems

To give some specific examples for the generalized maximal tensor product, we have calculated it for the toy theories introduced in sections IV and V using the double description method [21].



1. Noisy boxworld

In the original unrestricted version of boxworld joint systems are given by the maximal tensor product, including the 16 extremal product states and 8 pure entangled joint states $\Phi^{AB} = \frac{1}{2}(\omega_1 \otimes \omega_2 - \omega_2 \otimes \omega_2 + \omega_2 \otimes \omega_3 + \omega_3 \otimes \omega_1)$ and, respectively, the states transformed by local symmetries.

These entangled extremals can be interpreted as maximally entangled states of two such systems, as they form an isomorphic map and have totally mixed reduced states. They correspond to a rotation of $\pi/4$ and the local symmetries of the state spaces.

This theory has become very popular as it shows nonlocal correlations beyond those possible in quantum theory, when choosing between two possible binary measurements at each side of the bipartite systems. Let us denote the two measurements $\{M^A_x\}$ and $\{M^B_y\}$ for each of the systems A and B respectively: we index the measurements at each system with $x, y \in \{0, 1\}$. Each measurement has binary outcomes, labelled with $a, b \in \{0, 1\}$ for systems A and B respectively. For example, the $x = 0$ measurement on system A consists of a pair of effects $M^A_0 = \{e_0, e_1\}$ satisfying $e_0 + e_1 = u$; similarly for the $x = 1$ measurement on system A, and the $y \in \{0, 1\}$ measurements on system B. This leads to a bipartite conditional probability distribution

\[ P(a, b | x, y) := (e_a \otimes e_b)[\omega^{AB}], \tag{44} \]

where $e_a$ belongs to the measurement $M^A_x$ and $e_b$ to the measurement $M^B_y$. We define the correlation

\[ C_{xy} := P(a = b | x, y) - P(a \neq b | x, y). \]

To introduce the Clauser-Horne-Shimony-Holt (CHSH) inequality for demonstrating nonlocality, we introduce the parameter

\[ S := |C_{00} + C_{01} + C_{10} - C_{11}|. \]

For classical systems it is upper bounded by the CHSH inequality [22]

\[ S^C \leq 2, \]

whereas for quantum theory it must satisfy $S^Q \leq 2\sqrt{2}$ [23]. However, local measurements on the maximally entangled state $\Phi^{AB}$ in boxworld can produce correlations which reach the algebraic maximum $S^{\max} = 4$, i.e. the theory allows the post-quantum correlations known as PR boxes [13].
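
The CHSH value of the maximally entangled boxworld state can be checked with a short numerical sketch. As assumptions for illustration, the boxworld states are represented as $(\pm 1, 0, 1)^T$, $(0, \pm 1, 1)^T$ and the effects as $\frac{1}{2}(\pm 1, \pm 1, 1)^T$; the names below are hypothetical. Since the position of the minus sign in $S$ depends on how settings and outcomes are labelled, the sketch takes the largest value over the four possible placements.

```python
import numpy as np

# Assumed boxworld representation: rows are omega_1..omega_4 and e_1..e_4.
W = np.array([[1, 0, 1], [0, 1, 1], [-1, 0, 1], [0, -1, 1]], dtype=float)
E = np.array([[1, 1, 1], [-1, 1, 1], [-1, -1, 1], [1, -1, 1]], dtype=float) / 2

# Maximally entangled state Phi^{AB} as a 3x3 matrix.
PHI = 0.5 * (np.outer(W[0], W[1]) - np.outer(W[1], W[1])
             + np.outer(W[1], W[2]) + np.outer(W[2], W[0]))

# Two binary measurements per side; each pair of effects sums to the unit effect.
MEASUREMENTS = [(E[0], E[2]), (E[1], E[3])]

def correlator(state, x, y):
    """C_xy = P(a = b | x, y) - P(a != b | x, y), with P from Eq. (44)."""
    p = np.array([[ea @ state @ eb for eb in MEASUREMENTS[y]]
                  for ea in MEASUREMENTS[x]])
    return p[0, 0] + p[1, 1] - p[0, 1] - p[1, 0]

def chsh(state):
    """Largest CHSH value over the four placements of the minus sign."""
    c = np.array([[correlator(state, x, y) for y in (0, 1)] for x in (0, 1)])
    return max(abs(c.sum() - 2 * c[x, y]) for x in (0, 1) for y in (0, 1))
```

With these conventions `chsh(PHI)` reaches the algebraic maximum, whereas a product state such as `np.outer(W[0], W[0])` stays within the classical bound.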
For the noisy version of boxworld introduced in section IV there is still a notion of a maximally entangled state in the generalized maximal tensor product, namely

\[ \omega^{AB}_{\mathrm{ent}.1} = \mathrm{diag}\left(\tfrac{1}{\lambda}, \tfrac{1}{\lambda}, 1\right) \cdot \Phi^{AB}, \tag{45} \]

i.e. the original maximally entangled state $\Phi^{AB}$ combined with a mapping of the effects on one side of the system to the original unrestricted set. Note that this map does not undo the restriction of effects completely. The reversion only happens to occur in this particular case when mapping to states of the other part. On the other part, however, only restricted effects can be applied. Consequently, the correlations possible with restricted systems will be different to those possible in unrestricted systems.

Furthermore, constructing the generalized maximal tensor 
product it turns out there are 4 different classes of new pure 
joint states that are entangled but not maximally entangled. 
These are representatives of each class 



"^6111.2 

"'ent.3 



-aa)2®a)2 + j3a)2®a)4 + j3a)4®a)2-aa)4®(»4 

-aC02<E)C02+P(02<^(03+P(04<i^(l>2-CC(04<i^(03 



-a COj (E) C02 + P (04 (S) (Oi + P (04 (S) Oh - 01 (04 ® (Oi 

1-A „ 1+A 



®ent.5 



with a 



4A 



4A 



(46) 



where the other elements of each class only differ by the local symmetries.

In conclusion the generalized maximal tensor product is spanned by 96 pure states. Namely, it consists of 16 local pure states, 8 pure entangled states of class $\omega^{AB}_{\mathrm{ent}.1}$, 8 of class $\omega^{AB}_{\mathrm{ent}.2}$, 16 of class $\omega^{AB}_{\mathrm{ent}.3}$, 16 of class $\omega^{AB}_{\mathrm{ent}.4}$ and 32 states of class $\omega^{AB}_{\mathrm{ent}.5}$.

Considering local measurements on one instance of any of the nonlocal extremal states, the maximal CHSH violation $S^\lambda$ as a function of the parameter $\lambda$ of the restricted model can be shown to be $4\lambda$. Note that this bound is only guaranteed for the correlations that occur from direct measurements. It is known, however, that wiring the measurements on multiple joint states via classical post-processing might give rise to a distillation of correlations beyond this bound for some values of $\lambda$ [24].
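
The dependence of the CHSH value on $\lambda$ can be illustrated for the state of Eq. (45) with the following sketch. It rests on assumptions that are not spelled out in this section: the boxworld representation with states $(\pm 1, 0, 1)^T$, $(0, \pm 1, 1)^T$ and effects $\frac{1}{2}(\pm 1, \pm 1, 1)^T$, and a noise model in which each restricted effect is $\lambda e_i + (1 - \lambda) u / 2$. All names are hypothetical.

```python
import numpy as np

# Assumed boxworld representation: rows are omega_1..omega_4 and e_1..e_4.
W = np.array([[1, 0, 1], [0, 1, 1], [-1, 0, 1], [0, -1, 1]], dtype=float)
E = np.array([[1, 1, 1], [-1, 1, 1], [-1, -1, 1], [1, -1, 1]], dtype=float) / 2
U = np.array([0.0, 0.0, 1.0])

PHI = 0.5 * (np.outer(W[0], W[1]) - np.outer(W[1], W[1])
             + np.outer(W[1], W[2]) + np.outer(W[2], W[0]))

def correlator(state, meas_a, meas_b):
    """P(a = b) - P(a != b) for one pair of binary measurements."""
    p = np.array([[ea @ state @ eb for eb in meas_b] for ea in meas_a])
    return p[0, 0] + p[1, 1] - p[0, 1] - p[1, 0]

def noisy_chsh(lam):
    """CHSH value of diag(1/lam, 1/lam, 1) . PHI under restricted (noisy) effects."""
    noisy = lam * E + (1.0 - lam) * U / 2.0       # assumed noise model
    meas = [(noisy[0], noisy[2]), (noisy[1], noisy[3])]
    state = np.diag([1.0 / lam, 1.0 / lam, 1.0]) @ PHI
    c = np.array([[correlator(state, meas[x], meas[y]) for y in (0, 1)]
                  for x in (0, 1)])
    # Largest value over the four placements of the minus sign.
    return max(abs(c.sum() - 2 * c[x, y]) for x in (0, 1) for y in (0, 1))
```

Under these assumptions the value scales linearly with $\lambda$ and reduces to the boxworld maximum for $\lambda = 1$.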



2. Self-dualized polygons 

Interestingly, not only boxworld but all bipartite polygon
systems allow a joint state with features known from the
maximally entangled state of ordinary quantum theory. Namely,
the linear maps corresponding to these states are given by
isomorphisms of the dual and primal cones with maximally
mixed reduced states. The $2n$ different maximally entangled
states correspond to the elements of the dihedral group.
For even $n$, the maximally entangled states include an
additional rotation by $\pi/n$, mapping the dual cone of one part to
the primal cone of the other part. It was shown that nonlocal
correlations based on two binary local measurements on each
side of the maximally entangled states are strictly weaker
than quantum correlations for the odd case, whereas the
unrestricted even case shows correlations as strong as those
of quantum theory or stronger [9].
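The $\pi/n$ rotation for even $n$ can be made concrete in a short sketch. It assumes the parametrization commonly used for polygon models, which this section does not restate: states $\omega_i = (r_n \cos(2\pi i/n), r_n \sin(2\pi i/n), 1)$ with $r_n = \sqrt{1/\cos(\pi/n)}$, and extremal effects proportional to $(r_n \cos\theta_i, r_n \sin\theta_i, 1)$, with $\theta_i = (2i-1)\pi/n$ and prefactor $1/2$ for even $n$, and $\theta_i = 2\pi i/n$ and prefactor $1/(1+r_n^2)$ for odd $n$. The function names are illustrative.

```python
import math


def polygon(n):
    """Return (states, effects, offset) for the n-gon model assumed above."""
    r = math.sqrt(1 / math.cos(math.pi / n))
    states = [(r * math.cos(2 * math.pi * i / n),
               r * math.sin(2 * math.pi * i / n), 1.0) for i in range(n)]
    if n % 2 == 0:
        offset, norm = -math.pi / n, 0.5       # effect rays rotated by pi/n
    else:
        offset, norm = 0.0, 1 / (1 + r * r)    # effect rays aligned with states
    effects = [tuple(norm * c for c in
                     (r * math.cos(2 * math.pi * i / n + offset),
                      r * math.sin(2 * math.pi * i / n + offset), 1.0))
               for i in range(n)]
    return states, effects, offset


def probabilities(n):
    """All outcome probabilities e_i(omega_j) as Euclidean inner products."""
    states, effects, _ = polygon(n)
    return [sum(e_k * w_k for e_k, w_k in zip(e, w))
            for e in effects for w in states]


for n in (4, 5, 6, 7):
    p = probabilities(n)
    print(n, round(min(p), 12), round(max(p), 12))
```

In this parametrization the extremal effects of odd polygons point along the state rays, while for even polygons they sit halfway between neighbouring states, which is the offset that the maximally entangled state has to compensate.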

Replacing the original polygon systems with even $n$ by their
self-dualized versions, the maximally entangled states lose the
additional rotation, as the new effect cone and the state cone
coincide. Note that the self-dualized single systems become
subtheories of the theory given in the limit $n \to \infty$, i.e. the
quantum case, as both states and effects form strict subsets.
Thus, the correlations on the maximally entangled state form
a strict subset of those in quantum theory, in contrast to the
unrestricted case, which allows post-quantum correlations. Even
though the restricted polygons are not genuinely strongly
self-dual but only self-dualized, this is consistent with the
conjecture in [9] that strong self-duality limits correlations.

For self-dualized boxworld the generalized maximal tensor
product is given by the 16 local pure states, the 8 states $\omega_{\mathrm{ent},1}$
representing the identity and symmetry mappings as well as
a class of 64 pure entangled states (0^^^2 = 1 /4(— fi)i (E)(Oi + 

(0i(E)(03+2(O2(E)(04 + (Oi(E}(0i-Oh,(E)(O3+2(04(E) (B2). 

Unfortunately, using the double description method, we 
were not able to characterize all extremals of the generalized 
maximal tensor product for polygon systems with a higher 
number of vertices. 



3. Spekkens's toy theory

The Spekkens theory that we introduced earlier is a local
theory, meaning that (in the probabilistic version) it cannot
violate any Bell inequalities. However, as discussed in [15], the
Spekkens theory has entangled states. This raises the question
of why the Spekkens theory does not exhibit bipartite
nonlocality. In contrast, a classical theory, i.e. a simplex, is local
but does not have entangled states. One could then ask:
given that the Spekkens theory has entangled states but is
local, what must be added to the definition of the theory to make
it nonlocal?

In our framework, the answer to this question can be
clearly understood in terms of the geometry of the state space.
First, recall that the state space $\Omega^A$ of a single system in the
Spekkens theory is an octahedron, and the effect space is
identical to it, i.e. it is not the full dual space. Consider
a pair of single systems A and B in the Spekkens theory. Since
the effect space $E^A$ is not the full dual space $A^*$, we must use
the generalized tensor product to define the set of bipartite
states $\Omega^{AB}$. Then consider the following bipartite state:



(0^" = 



/O 

- 
^ 









1 _ 1 

2 5' 

1/ 



(47) 



It is straightforward to verify that $\omega^{AB}$ leads to well-defined
conditional states for system B for all effects $e^A \in E^A$,
and correspondingly for conditional states for system A when
using effects on system B. In particular, it is also easily
checked that $(e^A \otimes e^B)[\omega^{AB}] \geq 0$ for any pair of effects $e^A$ and
$e^B$. Hence, by the theorem on the generalized maximal tensor
product, this shows that $\omega^{AB}$ is in the generalized
tensor product $A_+ \otimes_{\max} B_+$ for the Spekkens theory.

Now, since the Spekkens theory is local, the CHSH inequality
$|S| \le 2$ is satisfied for any choice of measurements $M_x$
and $M_y$ on the state $\omega^{AB}$, or any other bipartite state. However,
let us consider the unrestricted effect space $A_+^*$ from which the
restricted space for the Spekkens theory was derived. The
unrestricted effect space of the octahedron is the cube. We can
represent the normalised extremal effects as the vertices of a
cube:




Now, suppose that we use the cube as the effect space for
the octahedron, i.e. we use the full dual space. It is easily
shown that the state $\omega^{AB}$ defined in Eq. 47 is again in the
generalized maximal tensor product $A_+ \otimes_{\max} B_+$. However, we
can now provide measurements which violate the CHSH
inequality. In particular, consider two measurements for Alice
given by $M_0^A = \{e_0, u - e_0\}$ and $M_1^A = \{e_1, u - e_1\}$, where:



and two measurements for Bob given by $M_0^B = \{e_0, u - e_0\}$
and $M_1^B = \{e_2, u - e_2\}$, where:



By using these choices of measurements in Eq. 44 and the
following equations, we obtain the value $S = 4$ for the CHSH
parameter. This is the value attained by PR boxes, and
hence using the full effect space essentially yields the same
nonlocality as boxworld.
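The effect of enlarging the effect space can be checked numerically in a minimal sketch. It does not use the state of Eq. 47 or the measurements given above. Instead it assumes a hypothetical joint state with maximally mixed marginals, described only by a $3 \times 3$ correlation matrix `T`, and effects of the form $e_s(r) = (1 + s \cdot r)/2$ acting on octahedron vectors $r$, where $s$ is an octahedron vertex for the restricted (Spekkens) case and a cube vertex for the full dual. All names and the matrix `T` are illustrative choices.

```python
from itertools import product

# Hypothetical correlation matrix of a zero-marginal joint state
# (an assumption for illustration, not the state of Eq. 47).
T = [[0.5, 0.5, 0.0],
     [0.5, -0.5, 0.0],
     [0.0, 0.0, 0.0]]

# Effect vectors s for e_s(r) = (1 + s.r)/2 on octahedron states r.
OCTA = [tuple(sign if i == k else 0 for i in range(3))
        for k in range(3) for sign in (1, -1)]   # restricted effects
CUBE = list(product((1, -1), repeat=3))          # full dual: cube vertices


def corr(s, t):
    """Correlator s^T T t of the +/-1 observables defined by s and t."""
    return sum(s[i] * T[i][j] * t[j] for i in range(3) for j in range(3))


def is_valid(effects, tol=1e-12):
    """Check that conditional states stay in the octahedron and p >= 0."""
    for s in effects:
        cond_b = [sum(s[i] * T[i][j] for i in range(3)) for j in range(3)]
        cond_a = [sum(T[i][j] * s[j] for j in range(3)) for i in range(3)]
        # Octahedron membership: L1 norm of the conditional vector <= 1.
        if sum(map(abs, cond_b)) > 1 + tol or sum(map(abs, cond_a)) > 1 + tol:
            return False
    # Joint probabilities (1 + corr)/4 must be non-negative.
    return all(1 + corr(s, t) >= -tol for s in effects for t in effects)


def max_chsh(effects):
    """Brute-force maximum of E00 + E01 + E10 - E11 over effect choices."""
    return max(corr(a0, b0) + corr(a0, b1) + corr(a1, b0) - corr(a1, b1)
               for a0, a1, b0, b1 in product(effects, repeat=4))


print("octahedron effects:", is_valid(OCTA), max_chsh(OCTA))
print("cube effects:      ", is_valid(CUBE), max_chsh(CUBE))
```

In this sketch the same state is admissible for both effect sets, but only the cube effects push the CHSH value above the local bound, which mirrors the argument that locality here stems from the restricted measurements rather than from the states.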

We therefore see that the Spekkens theory can be embedded
into a nonlocal theory by embedding the effect space of
a single system into the full dual cone. Moreover, we see that
completing the Spekkens theory in this way yields boxworld. This
provides a new understanding of why the Spekkens theory is
local: the measurements are too restricted.



VIII. CONCLUSIONS 

We have extended the framework of generalized probabilistic
theories. Given an arbitrary state space, the traditional
framework determines the possible measurement outcomes as
corresponding to the complete set of probability-valued linear
functionals on states. In contrast to the traditional framework,
our generalization allows the set of states and the set of
effects to be defined separately. As a result, the upper bound for
the set of joint states, known as the maximal tensor product, is
no longer valid in its traditional form, but has to be replaced
by a generalized version.

As an application for restricted models, we provided a
self-dualization procedure that alters any theory by restricting the
set of effects, such that states and the restricted effects are
related in the same way as states and unrestricted effects in
strongly self-dual systems. We introduced specific examples for
which the self-dualization does not only give a formal resemblance
but reproduces a phenomenon called bit symmetry, which was shown
to hold only for strongly self-dual systems in the traditional
framework [14]. Furthermore, these self-dualized models show
quantum correlations, whereas the original models have
correlations that are stronger than quantum correlations. In
particular, boxworld, a theory known to allow correlations
restricted only by the no-signalling principle, has classical
correlations if self-dualized, even though the generalized maximal
tensor product includes maximally entangled states. We showed how
the Spekkens theory is related to this model, since it is also
self-dual and violates the no-restriction hypothesis; were it to
satisfy this hypothesis, by taking the full dual cone, it would
produce nonlocal correlations.

As another application for restricted models, we showed that
restrictions can be used to alter theories such that their
measurements are inherently noisy. This differs from the
unrestricted theories, since in our noisy theories there is no
non-trivial extremal effect that occurs with certainty for a
pure state. We derived the maximal CHSH violation
of a noisy version of boxworld as a function of a noise
parameter $\lambda$.

The modified framework is therefore suitable for examining
new situations that could not be addressed using the traditional
framework. In particular, the self-dualization procedure might
be useful for the study of strong self-duality, which has recently
received much interest [2, 9, 14].